Now we are in the second exercise session. Non-flat files were promised, so please load the GESIS Panel COVID-19 data.
This first part of the exercises only deals with importing data. Later, in the second exercise, we will turn more to non-flat data files and label them before exporting.
The data are stored in SPSS format (.sav), so you can import them with the haven package.
library(haven)
gp_covid <-
  read_spss("../data/ZA5667_v1-1-0.sav")
In contrast to flat files such as CSV, the variables now have labels. What are the labels of the first ten variables?
You can use sjlabelled::get_label(your_data), but make sure to print the labels of the first ten variables only.
library(sjlabelled)
##
## Attaching package: 'sjlabelled'
## The following objects are masked from 'package:haven':
##
## as_factor, read_sas, read_spss, read_stata, write_sas, zap_labels
get_label(gp_covid[1:10])
## za_number
## "Studiennummer des Archivs"
## version
## "Versionskennung und -datum des Archivs"
## doi
## "Digital Object Identifier (doi)"
## id
## "Befragten-ID"
## cohort
## "Rekrutierungskohorte"
## sex
## "Geschlecht"
## age_cat
## "Alter, kategorisiert"
## education_cat
## "Bildung, kategorisiert"
## intention_to_vote
## "Sonntagsfrage (gbzc011a)"
## choice_of_party
## "Sonntagsfrage Wahlentscheidung"
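The labels printed above live in the "label" attribute that each imported column carries. As a minimal base-R sketch of the same idea, the following reads that attribute for a subset of columns. The data frame `toy` and its values are hypothetical and only mimic the GESIS Panel labels.

```r
# Hypothetical toy data; the labels mimic the GESIS Panel variable labels.
toy <- data.frame(id = 1:3, sex = c(1, 2, 1), age_cat = c(3, 5, 2))
attr(toy$id, "label") <- "Befragten-ID"
attr(toy$sex, "label") <- "Geschlecht"
attr(toy$age_cat, "label") <- "Alter, kategorisiert"

# Select the first two columns, then read each column's "label" attribute.
first_labels <- vapply(
  toy[1:2],
  function(x) attr(x, "label", exact = TRUE),
  character(1)
)
print(first_labels)
```

Selecting columns with `toy[1:2]` keeps the column attributes intact, which is why subsetting before asking for labels works. Note that this sketch assumes every selected column has a label; an unlabelled column would make `vapply()` fail.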
Unfortunately, it’s all in German. Imagine you are an education researcher interested in the variable education_cat. You may want to translate its variable label into English.
Change the variable label of education_cat from “Bildung, kategorisiert” to “Education, categorized”.
You can use sjlabelled::set_label() or do it in a pipe with sjlabelled::var_labels().
# either
gp_covid$education_cat <-
  set_label(
    gp_covid$education_cat,
    label = "Education, categorized"
  )
# or
library(dplyr)
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:sjlabelled':
##
## as_label
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
gp_covid <-
  gp_covid %>%
  var_labels(
    education_cat = "Education, categorized"
  )
# proof
get_label(gp_covid$education_cat)
## [1] "Education, categorized"
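If more than one label needs translating, a lookup vector can save some typing. The following is only a base-R sketch under stated assumptions: the data frame `toy`, the vector `translations`, and the English wordings are all hypothetical and not part of the GESIS Panel data.

```r
# Hypothetical toy data with German variable labels.
toy <- data.frame(sex = c(1, 2, 1), age_cat = c(3, 5, 2))
attr(toy$sex, "label") <- "Geschlecht"
attr(toy$age_cat, "label") <- "Alter, kategorisiert"

# Named vector: variable name -> English label (our own translations).
translations <- c(sex = "Sex", age_cat = "Age, categorized")

# Overwrite the "label" attribute of every variable listed in the lookup.
for (v in names(translations)) {
  attr(toy[[v]], "label") <- translations[[v]]
}

translated <- vapply(
  toy,
  function(x) attr(x, "label", exact = TRUE),
  character(1)
)
print(translated)
```

Keeping the translations in one named vector makes them easy to review and to reuse on another wave of the same data.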
Your colleague asks you to share the data with the updated labels. Unfortunately, she uses neither R nor SPSS and asks you to export the data as a Stata file.
The haven package offers write_dta() for this; the solution below uses write_stata() from sjlabelled.
write_stata(gp_covid, "gesis_panel_corona_fancy_panels_final_final.dta")
## Tidying value labels. Please wait...
## Writing stata file to 'gesis_panel_corona_fancy_panels_final_final.dta'. Please wait...
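Before sending the file, it can be worth checking that the variable label survives the export. The following sketch assumes the haven package is installed and uses its `write_dta()` and `read_dta()` functions on a hypothetical one-column data frame `toy` written to a temporary file, not on the real panel data.

```r
library(haven)

# Hypothetical toy data with one labelled variable.
toy <- data.frame(education_cat = c(1, 2, 3))
attr(toy$education_cat, "label") <- "Education, categorized"

# Write to a temporary Stata file and read it back in.
path <- tempfile(fileext = ".dta")
write_dta(toy, path)
reread <- read_dta(path)

# The variable label should still be attached after the round trip.
label_after <- attr(reread$education_cat, "label", exact = TRUE)
print(label_after)
```

A round trip like this does not replace opening the file in Stata, but it catches the most common problem, labels silently dropped on export.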